independence system
- Europe > Sweden > Stockholm > Stockholm (0.04)
- Europe > Finland (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- (2 more...)
- Europe > Sweden > Stockholm > Stockholm (0.04)
- Europe > Finland (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- (2 more...)
- Asia > China (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > Canada (0.04)
Distinguishability of causal structures under latent confounding and selection
Carey, Ryan, Ansanelli, Marina Maciel, Wolfe, Elie, Evans, Robin J.
Statistical relationships in observed data can arise for several different reasons: the observed variables may be causally related, they may share a latent common cause, or there may be selection bias. Each of these scenarios can be modelled using different causal graphs. Not all such causal graphs, however, can be distinguished by experimental data. In this paper, we formulate the equivalence class of causal graphs as a novel graphical structure, the selected-marginalized directed graph (smDG). That is, we show that two directed acyclic graphs with latent and selected vertices have the same smDG if and only if they are indistinguishable, even when allowing for arbitrary interventions on the observed variables. As a substitute for the more familiar d-separation criterion for DAGs, we provide an analogous sound and complete separation criterion in smDGs for conditional independence relative to passive observations. Finally, we provide a series of sufficient conditions under which two causal structures are indistinguishable when there is only access to passive observations.
- North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Denmark > North Jutland > Aalborg (0.04)
- Health & Medicine (0.67)
- Government (0.45)
Optimal Sampling Gaps for Adaptive Submodular Maximization
Running machine learning algorithms on large and rapidly growing volumes of data are often computationally expensive, one common trick to reduce the size of a data set, and thus reduce the computational cost of machine learning algorithms, is \emph{probability sampling}. It creates a sampled data set by including each data point from the original data set with a known probability. Although the benefit of running machine learning algorithms on the reduced data set is obvious, one major concern is that the performance of the solution obtained from samples might be much worse than that of the optimal solution when using the full data set. In this paper, we examine the performance loss caused by probability sampling in the context of adaptive submodular maximization. We consider a easiest probability sampling method which selects each data point independently with probability $r\in[0,1]$. We define sampling gap as the largest ratio of the optimal solution obtained from the full data set and the optimal solution obtained from the samples, over independence systems. Our main contribution is to show that if the utility function is policywise submodular, then for a given sampling rate $r$, the sampling gap is both upper bounded and lower bounded by $1/r$. One immediate implication of our result is that if we can find an $\alpha$-approximation solution based on a sampled data set (which is sampled at sampling rate $r$), then this solution achieves an $\alpha r$ approximation ratio for the original problem when using the full data set. We also show that the property of policywise submodular can be found in a wide range of real-world applications, including pool-based active learning and adaptive viral marketing.
Streaming Non-Monotone Submodular Maximization: Personalized Video Summarization on the Fly
Mirzasoleiman, Baharan (ETH Zurich) | Jegelka, Stefanie (MIT) | Krause, Andreas (ETH Zurich)
The need for real time analysis of rapidly producing data streams (e.g., video and image streams) motivated the design of streaming algorithms that can efficiently extract and summarize useful information from massive data "on the fly." Such problems can often be reduced to maximizing a submodular set function subject to various constraints. While efficient streaming methods have been recently developed for monotone submodular maximization, in a wide range of applications, such as video summarization, the underlying utility function is non-monotone, and there are often various constraints imposed on the optimization problem to consider privacy or personalization. We develop the first efficient single pass streaming algorithm, Streaming Local Search, that for any streaming monotone submodular maximization algorithm with approximation guarantee α under a collection of independence systems I, provides a constant 1/(1+2/√α+1/α+2d(1+√α)) approximation guarantee for maximizing a non-monotone submodular function under the intersection of I and d knapsack constraints. Our experiments show that for video summarization, our method runs more than 1700 times faster than previous work, while maintaining practically the same performance.
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Vision (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.50)
Utilitarians Without Utilities: Maximizing Social Welfare for Graph Problems Using Only Ordinal Preferences
Abramowitz, Ben (Rensselaer Polytechnic Institute) | Anshelevich, Elliot (Rensselaer Polytechnic Institute)
We consider ordinal approximation algorithms for a broad class of utility maximization problems for multi-agent systems. In these problems, agents have utilities for connecting to each other, and the goal is to compute a maximum-utility solution subject to a set of constraints. We represent these as a class of graph optimization problems, including matching, spanning tree problems, TSP, maximum weight planar subgraph, and many others. We study these problems in the ordinal setting: latent numerical utilities exist, but we only have access to ordinal preference information, i.e., every agent specifies an ordering over the other agents by preference. We prove that for the large class of graph problems we identify, ordinal information is enough to compute solutions which are close to optimal, thus demonstrating there is no need to know the underlying numerical utilities. For example, for problems in this class with bounded degree b a simple ordinal greedy algorithm always produces a (b + 1)-approximation; we also quantify how the quality of ordinal approximation depends on the sparsity of the resulting graphs. In particular, our results imply that ordinal information is enough to obtain a 2-approximation for Maximum Spanning Tree; a 4-approximation for Max Weight Planar Subgraph; a 2-approximation for Max-TSP; and a 2- approximation for various Matching problems.